1. INTRODUCTION

The strategic goal of modern earth sciences research is to model the structure, functioning, dynamics
and development of natural, social and industrial systems and their interactions in order to optimize the economic
development of regions, minimize the manifestation of destructive geo-ecological processes, and predict natural
and natural-technogenic situations in the geographic shell at the global, regional and local levels.
The geographic shell is considered as a geosystem: “a special class of control systems; the earth space of all
dimensions, where individual components of nature are in systemic connection with each other and, as a certain
integrity, interact with the cosmic sphere and human society” [1]. Progress in the geosystem approach from the
second half of the 20th to the early 21st century led to the formation of four vectors of scientific research:
i) morphological: diagnostics of the elements of systems and the structural relations between them; ii) process:
analysis of the flows of dynamically interconnected cycles and of the metabolism of matter and energy in the
geographic shell; iii) paragenetic: the study of how the metabolism of matter and energy interacts with the
structure of geographic objects, which gives a basis for conclusions about the origin and development of
geosystems; and iv) the study of total (natural-social-production) geosystems for complex geo-diagnostics
(monitoring) of the development of ecological-socio-economic processes.
The functioning of geosystems can be represented as a set of processes of movement, exchange and
transformation of energy, matter and information between their elements and the surrounding geographic space.
The leading processes are the transfer and transformation of solar energy, moisture circulation, lithological
and geochemical cycles, and biological metabolism. These cycles develop interdependently and impart
integrity to geosystems, forming spatio-temporal patterns of differentiation of the geographic shell
at different levels of organization. The determining factors for the formation of the spatial structure,
development, dynamics and functioning of geosystems are neotectonic movements, the composition of rocks
in the zone of free water exchange, climatic conditions and the regime of surface waters, and technogenic
impacts. The purpose of this article is to develop new algorithms, based on machine learning methods and
technologies, for analyzing the metageosystem model of a territory and the state of its lands in order to
assess the water balance of the territory.


2. RELATED WORK 

Geographical research at the turn of the 20th and 21st centuries shows that synthetic maps of geosystems
should be considered the central link in GIS [2]. This approach rests on the objective existence
in nature of interconnected combinations of geo-components that form geosystems. Their relative homogeneity
presupposes the same type of economic development and use.

The general scheme for compiling the electronic map of geosystems in the regional GIS "Mordovia" is
implemented by solving the following tasks: i) collecting and preparing a system of thematic maps and databases,
ii) systematizing information and constructing a hierarchy of geosystems, iii) ensemble analysis of
multispectral space images and construction of a synthetic map of geosystems, iv) evaluating the simulation
results, and v) obtaining and practically using spatial information, as shown in Figure 1.
Figure 1. Algorithm for compiling the electronic map of geosystems

The collection and preparation of the system of thematic maps and databases is based on factual materials
from digital spatial data infrastructures (SDI) and earth remote sensing data (ERSD) [3]. The SDI of the geographic
information system (GIS) "Mordovia" includes electronic maps of the tectonics of the basement and sedimentary
cover; manifestations of recent and modern tectonic movements; bedrock and Quaternary sediments;
groundwater hydrodynamics and hydrochemistry; morphometric (slope, height, profile curvature) and
morphological relief characteristics; climate dynamics and the hydrological regime of surface waters; the
structure of the soil and vegetation cover; land use; and geotechnical systems and the density of these objects.
After revision, the initial information is structured into normalized datasets suitable for training and testing
machine learning algorithms and for building and testing models.

Experience shows that it is advisable to interpret geosystems by moving from small-scale images to
large-scale ones, accumulating information about the diagnostic features of natural complexes along the way.
Using images of varying degrees of generalization helps establish regularities in the spatio-temporal
organization of geosystems and makes the interpretation of diagnostic features more reliable and more
accurate.

Systematizing information and constructing the hierarchy of geosystems in the GIS "Mordovia" is guided
by the identification of genetically homogeneous, territorially adjacent formations that become distinct under
the influence of a certain mode of the spectrum of geographic processes: systems, classes, groups, types, genera
and species of geosystems [4]. From the standpoint of the geosystem approach, the state and properties of each
territorial unit are determined by the peculiarities of its interaction with neighboring objects of the same
hierarchical level, by the characteristics of the enclosing geospatial system of a higher level, and by the
interaction of the lower-level objects that make up the analyzed territory. It can therefore be assumed that the
accuracy of classifying geosystems from remote sensing data can be increased if the classifying
model takes into account not only the properties of a particular territory, but also the characteristic
features of the geosystems with which it interacts and, in particular, to which it belongs.

Effective training and use of machine classifiers in geospatial data analysis faces a number of
challenges. The most important are: training deep models when labeled data are scarce; overcoming the
intrinsic complexity of images obtained from remote sensing data (RSD); and determining model
hyperparameters when analyzing complex spatial data [5]-[7]. Deep models are capable of learning more
features, but they are highly susceptible to overfitting [8]-[10]. Another important problem is the adaptation
of individual classifiers (mono-classifiers) to a new dataset, which is relevant when a deep model is
additionally trained to classify the lands of a new spatial area in order to increase the cost-effectiveness and
speed of RSD machine analysis [11]-[13]. These problems can be approached by combining individual
classifiers into ensembles [14]. Research results show that combining
classifiers into a system improves the robustness of the classification algorithm [15].


3. MATERIALS AND METHODS 
3.1. Geosystem approach to data preparation 

Let us describe a data model that characterizes a territory from the perspective of the geosystem
approach, so that the problem of classifying geosystems can then be solved with machine learning models
that analyze these data efficiently. By classification we mean the operation $f$, performed by
the model $M$ with experience $E$, that assigns a class label $Y$ to a local object
characterized by a set of parameters $X_{Local}$ and by its direct interconnection with metageosystems,
described by the vector of properties $X_{MG}$:


$Y = f\left(\langle X_{Local}, X_{MG} \rangle, M, E\right)$ (1)


If $X_{MG}$ is an empty set, we have the case of classification without geosystem data. The set of
characteristics of the local object, $X_{Local}$, is formed on the basis of RSD and can take different formats. Thus, a
territory can be assigned to a class by pixel-based classification or by extracting features from territorial
fragments of different sizes (patch-based classification) [16]. In addition, the territory data are characterized
by spatial, spectral, and radiometric resolution [17]. The set of characteristics of a local object, which can be
packed into tensors of various dimensions, determines the level $L_0$ of the geospatial model of the territory.

From the standpoint of the geosystem approach, the properties of a territory are significantly influenced
by the enclosing geosystem, and RSD is a source of information about it. Stringent requirements are imposed on
the $L_0$-level data about the local object (they must be obtained at a strictly defined time and have a
sufficiently high resolution), which makes them quite expensive. The requirements for data of level $L_1$ and
higher can be relaxed, which simplifies their acquisition and reduces its cost. Currently, many
providers openly offer RSD of medium and high spatial resolution through application programming
interfaces (APIs). The low temporal resolution of these data is the reason for their
low cost. At the same time, they remain an informative source of $X_{MG}$ data about the enclosing geosystems
of various hierarchical levels.

To classify geosystems for assessing the water balance of the territory,
it is advisable to use images from the Sentinel-2 satellite. For primary testing of the proposed
methodology, the open EuroSAT dataset was used, which was created for training and testing machine learning
models on the classification problem. The dataset is evenly divided into 10 classes and consists
of 27 thousand images containing information on land plots in the European Union in 13 spectral ranges. Each
dataset element is 64 × 64 pixels in size, has a spatial resolution of 10 m per pixel, and is georeferenced.

The acquisition of data for levels $L_1, L_2, \ldots, L_N$ can potentially be fully automated: given
the geographical coordinates (latitude and longitude) of the classified area, a request can be made
to the API of a spatial data provider to obtain a fragment of a space image of the territory at these coordinates
with the required scale and resolution. This makes it possible to expand the training dataset
algorithmically by importing fragments of space imagery that characterize metageosystems of a higher
hierarchical level and contain the classified area. Data of levels $L_1, L_2, \ldots, L_N$ can be obtained
automatically as fragments of open satellite images in the visible spectral range from the online map provider
MapBox via its application programming interface at different zoom levels (tile display scales 8, 12 and 14).
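
As an illustration of how the lookup for the higher levels could be automated, the sketch below converts the coordinates of a classified area into tile indices at zoom levels 8, 12 and 14. It assumes the standard Web Mercator XYZ tiling scheme used by most online map providers; the function names and the sample coordinates are hypothetical, and the provider-specific request (URL, access token, image format) is deliberately left out.

```python
import math

ZOOM_LEVELS = (8, 12, 14)  # tile display scales named in the text


def tile_index(lat_deg, lon_deg, zoom):
    """Return (x, y) of the Web Mercator XYZ tile that contains the point.

    Assumption: the provider follows the common XYZ ("slippy map") scheme.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp to the valid range to guard against points on the map edge.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def context_tiles(lat_deg, lon_deg):
    """Map each zoom level to the tile enclosing the classified area."""
    return {z: tile_index(lat_deg, lon_deg, z) for z in ZOOM_LEVELS}


# Hypothetical point used only for illustration.
tiles = context_tiles(54.18, 45.18)
```

Because every tile at zoom 12 lies inside exactly one tile at zoom 8, the tiles returned for one point are nested, which matches the idea of enclosing geosystems of successive hierarchical levels.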

RSD of a certain scale are not the only data that can characterize geosystems of different hierarchical
levels; synthetic digital maps can play this role successfully as well. Let us formulate the hypothesis that
electronic landscape maps and other thematic maps (land cover, land use), which traditionally represent the
final artifact of analyzing and classifying RSD, carry a significant amount of information about the properties
of the territories they cover and can therefore be used to form input tensors of additional information in the
$X_{MG}$ set. These maps often have a relatively low resolution, but their high degree of abstraction suggests
good potential for enriching the information about a small classified area located within geosystems that are
distinguishable only at a smaller scale. Synthetic digital maps form another layer of the geospatial model of the
territory, $L_S$, and become another source for expanding the auxiliary dataset $X_{MG}$.


3.2. Designing of neural networks ensembles for data classification 

Individual classifiers can be integrated into ensembles: research results show that combining
classifiers into a system improves the robustness of the classification algorithm. The key components of an
ensemble are a set of individual mono-classifiers of various architectures and a meta-classifier, a module that
receives data from the mono-classifiers and makes the resulting decision on which class the analyzed
territorial object belongs to. Let us describe the ensemble training algorithm, in which the following tasks are
solved sequentially: i) training the individual mono-classifiers, ii) assessing the accuracy of each
mono-classifier, and iii) training the meta-classifier that analyzes the decisions made by the mono-classifiers
of the ensemble using weighted voting:


$H = \arg\max_{c \in C} \sum_{i=1}^{N} w(i,c)\, V(i,c)$, (2)
$V(i,c) = 1$, if classifier $i$ picks class $c$,
$V(i,c) = 0$, if classifier $i$ rejects class $c$,


where $w(i,c)$ is a weight coefficient characterizing the efficiency of classifier $i$ in detecting territorial
objects of class $c$ from the general set of classes $C$; $V(i,c)$ is a logical variable indicating that
classifier $i$ assigned the territorial object to class $c$; and $N$ is the total number of mono-classifiers in
the ensemble. The $F_\beta$-score can be used as the efficiency weight, since it depends on both precision and
recall (sensitivity):


$F_c(i) = \frac{(1+\beta^2)\,\mathrm{precision}\cdot\mathrm{recall}}{\beta^2\,\mathrm{precision}+\mathrm{recall}} = \frac{(1+\beta^2)\,TP_c(i)}{(1+\beta^2)\,TP_c(i)+\beta^2\,FN_c(i)+FP_c(i)}$ (3)


where:
-  $TP_c(i)$ is the number of correctly classified territorial objects of class c;
-  $FP_c(i)$ is the number of type I errors for territorial objects of class c;
-  $FN_c(i)$ is the number of type II errors for territorial objects of class c.
If we take $\beta = 1$, the metric becomes the harmonic mean of precision and
recall (the F1-score) [18]. The methodology for calculating the efficiency weight can be developed further by introducing


Development of the regional water balance regulation concept based on the ... (Anatoliy A. Yamashkin) 


1676 ISSN: 2502-4752


the concept of an inefficiency threshold. The weight coefficient of ensemble classifier $i$ for detecting
territorial objects of class $c$ then takes the following form:


$w(i,c) = \begin{cases} 0, & \text{if } F_c(i) - \varepsilon \le 0, \\ \dfrac{F_c(i) - \varepsilon}{1 - \varepsilon}, & \text{if } F_c(i) - \varepsilon > 0, \end{cases}$ (4)


where $\varepsilon$ is the inefficiency threshold.

The proposed weight equals one for an ideal classifier and zero if the quality of the classification
falls to or below the inefficiency threshold $\varepsilon$. In general, this efficiency metric, which depends on
precision and recall, captures the model's ability to correctly classify objects of a particular territorial
class while avoiding a high number of errors. If $\varepsilon$ is set to
0.5, the hypotheses of mono-classifiers that perform no better than guessing are discarded.
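
The weighting scheme can be sketched in a few lines. The code below is an illustration rather than the authors' implementation: it assumes the thresholded weight is normalised as (F − ε)/(1 − ε), which is what makes an ideal classifier's weight equal to one, and it assumes ε < 1. All function names and the sample votes are hypothetical.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from the counts of true positives, false positives
    (type I errors) and false negatives (type II errors) for one class."""
    b2 = beta * beta
    denom = (1.0 + b2) * tp + b2 * fn + fp
    return (1.0 + b2) * tp / denom if denom else 0.0


def weight(f_score, eps=0.5):
    """Thresholded weight: 0 at or below eps, 1 for an ideal classifier.

    Assumption: normalisation by (1 - eps), with eps < 1.
    """
    return max(0.0, (f_score - eps) / (1.0 - eps))


def weighted_vote(votes, weights, classes):
    """Return the class with the largest sum of weights over the
    classifiers that picked it; votes[i] is the class chosen by
    classifier i and weights[i][c] is its weight for class c."""
    scores = {c: 0.0 for c in classes}
    for i, c in enumerate(votes):
        scores[c] += weights[i][c]
    return max(classes, key=lambda c: scores[c])


# Hypothetical ensemble of three mono-classifiers and two classes.
classes = ["forest", "water"]
votes = ["forest", "water", "water"]
weights = [
    {"forest": 0.9, "water": 0.1},
    {"forest": 0.2, "water": 0.3},
    {"forest": 0.1, "water": 0.4},
]
decision = weighted_vote(votes, weights, classes)
```

In this made-up example a single classifier that is reliable on the class it picked outweighs two weaker classifiers that agree with each other, which is the behaviour that distinguishes weighted voting from a simple majority.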

To find the optimal solution, it is proposed to design the ensemble mono-classifiers according to the
following algorithm:

a. Formulating the system of requirements for the model: definition of inputs and outputs, performance and
accuracy characteristics.

b. Defining the basic architecture of the model using a block approach that describes the general
organization of the classifier.

c. Decomposing the blocks into sequential or branching structures.

d. Mitigating the loss of accuracy caused by overfitting by heuristically configuring the
hyperparameters of the deep model and adding normalization, subsampling and regularization layers.

e. Optimizing the model according to the principle "small is better than big": excessively deep neural
networks are prone to overfitting.

f. Training the model while testing various accuracy measures, optimization algorithms, loss functions, and
numbers of training epochs.

g. Analyzing the training process by calculating, over a series of experiments, how the mean and the
standard deviation of the classification accuracy depend on the training epoch.

h. Assessing the quality of the obtained solution by constructing error (confusion) matrices and
calculating accuracy and error metrics from the results of the model.

i. Concluding whether the obtained model meets the objective and subjective requirements imposed
on it.

The proposed chain of actions yields a mono-classifier with certain properties; the search
can be completed if the result is satisfactory. If the parameters of the designed and trained model do
not meet the requirements, it is necessary to roll back a few steps along the trajectory of creating the deep model
(up to the first stage, if the formulated requirements turned out to be unattainable) and repeat the search in a
heuristically adjusted direction. As a result, the process of searching for an effective classification model can
be formalized as a tree whose root node precedes the first stage of the search algorithm and corresponds
to formulating the research problem [19]. Each tree node fixes the state of the model at stage i of the
algorithm for finding an effective model. Terminal nodes of the tree correspond to particular solutions of the
search problem, that is, to deep classifiers that are ready for use.

The particular solutions obtained can be compared by split testing, which combines objective numerical
metrics of the models’ efficiency with subjective expert assessment of the classification
quality. The search for a classification model for the hierarchy of metageosystems can likewise be
formalized as a tree whose root node precedes the first stage of the algorithm for finding the patterns
of regional differentiation. Each tree node fixes the state of the model at stage i
of the algorithm for finding an effective model of regional metageosystems. Terminal nodes of the tree
correspond to particular solutions of the problem of finding an optimal model of metageosystems, ready to be
used by a deep classifier.




3.3. Classification of the metageosystem model of the territory 

To classify the metageosystem model of a territory, presented as the tuple $\langle X_{Local}, X_{MG} \rangle$, a
deep neural network is proposed. It accepts data tensors of various hierarchical levels describing the classified
territory ($L_0$) and the geosystems containing it ($L_1, L_2, \ldots, L_N, L_S$), and returns the hypothesis that
the territory belongs to a certain class. This model can be used both on its own and in combination with other
mono-classifiers. Viewed as a black box, the described deep classification model, built on the
geosystem approach, is a functional element that takes as input images of the territory ($L_0$) and of
its host geosystems ($L_i$), obtained from satellite imagery, as well as synthetic maps ($L_S$). The number
of inputs can vary with the number of levels of the metageosystem model of the territory. However, adding
inputs should be done with caution, since it inevitably increases the capacity of the model.
The model has one output in the form of a vector whose $i$-th element is the predicted probability that
the territory belongs to class $i$. The final hypothesis that a territory belongs to a certain class is put
forward according to the “winner takes all” principle: the object is assigned to the class for which the model
predicts the maximum probability.

For the primary extraction of features from the data of each input $L_k \in \{L_0, \ldots, L_i, \ldots, L_S\}$, we
introduce the Unit$L_k$ block, which extracts hierarchical features $F_{k,i}$ of levels $i = \overline{1, N}$ from
the original image $L_k$. The Unit$L_k$ block is decomposed into $N$ feature extraction units, each of which has
an external output. Each unit is a chain of layers. The first layer performs a depthwise separable convolution,
which extracts features from the original image and, in contrast to a conventional
convolutional layer, makes the model more compact and therefore more resistant to overfitting [20].
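
The compactness argument can be made concrete by counting weights. The sketch below compares a conventional convolution with a depthwise separable one (a depthwise K × K filter per input channel followed by a 1 × 1 pointwise convolution). Bias terms are ignored, and the layer sizes in the example (3 × 3 kernel, 13 input channels, 32 output filters) are assumptions chosen for illustration, not values from the paper.

```python
def conv_params(k, c_in, c_out):
    """Weights of a conventional k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights of a depthwise separable convolution (biases ignored):
    one k x k depthwise filter per input channel plus a 1 x 1 pointwise
    convolution that mixes channels."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 13 input channels, 32 output filters.
standard = conv_params(3, 13, 32)
separable = separable_conv_params(3, 13, 32)
reduction = standard / separable
```

For these sizes the conventional layer has 3744 weights and the separable one 533, roughly a sevenfold reduction; fewer trainable weights is the mechanism by which the layer becomes less prone to overfitting on small datasets.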

The underlying operation of the layer is two-dimensional convolution. With a kernel $W$ of
size $K$, it is a linear transformation in which each value $y_{i,j}$ of the output matrix $Y$ is calculated from
the values $x$ of the original matrix $X$:


$y_{i,j} = (W * X)_{i,j} = \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} W_{a,b}\, x_{i+a,\,j+b}$ (5)


The convolution operation preserves the structure and geometry of the input and is characterized by two
important properties: sparseness and reuse of the same weights. The next layer of the feature extraction block
is batch normalization [21], whose efficiency was confirmed experimentally; it provides
regularization and stabilizes the model. For activation, the ReLU function was selected,
which performs the transformation $f(x) = \max(0, x)$. The feature extraction block ends with a
subsampling (pooling) layer that applies the maximum operation to reduce the size of the resulting
representations and has external outputs [22]. Applied to the elements $x_{i,j}$ of the original matrix
$X$, subsampling produces the matrix $Y$; for a subsampling window of size $d$, each element $y_{i,j}$ is
calculated according to the expression:


$y_{i,j} = \max_{0 \le a < d,\; 0 \le b < d} x_{i+a,\,j+b}$ (6)


In the experiments, taking the maximum gave the best result. The number of output filters in the
convolution and the size of its kernel should be chosen as small as possible while maintaining acceptable
classification accuracy [23].
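
The three operations of a feature extraction unit can be traced on small matrices. The sketch below is a plain-Python illustration for a single channel: a "valid" convolution without padding, ReLU, and max subsampling. Two simplifications are assumptions of this sketch: batch normalization is omitted, and the subsampling windows are taken as non-overlapping (stride equal to the window size d), which is the usual way the size reduction is obtained.

```python
def conv2d(x, w):
    """Valid 2-D convolution of matrix x with a square K x K kernel w:
    y[i][j] = sum over a, b of w[a][b] * x[i + a][j + b]."""
    k = len(w)
    rows, cols = len(x) - k + 1, len(x[0]) - k + 1
    return [[sum(w[a][b] * x[i + a][j + b]
                 for a in range(k) for b in range(k))
             for j in range(cols)]
            for i in range(rows)]


def relu(x):
    """Element-wise max(0, v)."""
    return [[max(0, v) for v in row] for row in x]


def max_pool(x, d):
    """Max subsampling over non-overlapping d x d windows (assumption:
    stride equals the window size)."""
    rows, cols = len(x) // d, len(x[0]) // d
    return [[max(x[i * d + a][j * d + b]
                 for a in range(d) for b in range(d))
             for j in range(cols)]
            for i in range(rows)]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]  # hypothetical kernel: difference along the diagonal

features = max_pool(relu(conv2d(image, kernel)), 1)
pooled = max_pool(image, 2)
```

Each step shrinks or filters the representation: the 4 × 4 input becomes 3 × 3 after the 2 × 2 convolution, and 2 × 2 max subsampling of the raw image keeps only the largest value of each quadrant.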

The next component of the described model is the feature fusion module. It accepts as input the level-$N$
features extracted from the images of the classified area and of the associated geosystems. In addition, the
fusion modules of the second and subsequent levels accept the output of the previous fusion module as input.
The total number of feature fusion modules equals the number of levels of hierarchical feature extraction in the
Unit$L_k$ blocks. All input data are concatenated into a single tensor and processed by a feature extraction
pipeline similar in structure to that of the Unit$L_k$ module: depthwise separable convolution, batch
normalization, activation and subsampling layers [24], [25]. The number of output filters in the
convolution of the Merge$_N$ block should be chosen larger than the number of filters used to extract
the features of the corresponding level $N$ in the Unit$L_k$ module.

The output of the last feature fusion module is flattened into a vector and fed to the input of a
multilayer perceptron. The number of fully connected layers of the perceptron and their width
are chosen as small as possible while maintaining sufficient
classification accuracy. In addition, to counter overfitting, it is recommended to apply batch
normalization and dropout to the outputs of the fully connected layers. The ReLU function is used to activate
the outputs of the input and hidden layers; the output layer uses sigmoid (for binary classification) or softmax
(for multiclass classification).
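
The final decision step can be illustrated as follows: softmax turns the raw outputs of the last layer into class probabilities, and the "winner takes all" rule selects the most probable class. The function names and the sample logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one.

    The maximum is subtracted first for numerical stability; this does
    not change the result.
    """
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def winner_takes_all(probs):
    """Index of the class with the highest predicted probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical raw outputs of the last layer for three classes.
probs = softmax([1.0, 3.0, 2.0])
predicted_class = winner_takes_all(probs)
```

Since softmax is monotonic, the selected class is the same as the one with the largest raw score; the probabilities matter when the output is passed on, for example to a meta-classifier.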


4. RESULTS AND DISCUSSION 
4.1. Testing the proposed algorithm 

To test the proposed technique, images from the Sentinel-2 satellite were used. Table 1 shows the
classification accuracy on the test dataset for different ratios of training and test samples.
The ResNet-50 neural network achieves a classification accuracy of 96.43% with a training/test
ratio of 80/20 and 75.06% with a ratio of 10/90; a small two-layer convolutional network achieves an
accuracy of 87.96% at an 80/20 ratio and 75.88% at a 10/90 ratio. Note also that deep machine learning models
based on convolutional layers showed predominantly greater accuracy than support vector machines.


Table 1. Classification accuracy, %

                                     Ratio of training and test data
Model             Number of units   10/90  20/80  30/70  40/60  50/50  60/40  70/30  80/20  90/10
CNN (two layers)          422 378   75.88  79.84  81.29  83.04  84.48  85.77  87.24  87.96  88.66
ResNet-50              25 636 712   75.06  88.53  93.75  94.01  94.45  95.26  95.32  96.43  96.37
GoogleNet               6 797 700   77.37  90.97  90.57  91.62  94.96  95.54  95.70  96.02  96.17
Our Model               1 324 526   86.23  91.52  93.98  94.11  94.29  94.35  94.41  94.65  95.30


Thus, modern deep convolutional networks classify satellite images with excellent accuracy when the
training sample is relatively large; when training data are scarce, however, these approaches lose accuracy
significantly. Increasing the accuracy of methods and algorithms for analyzing spatial data under conditions
of data deficit therefore remains a relevant problem.
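
The size of this loss can be read directly from Table 1. The short calculation below takes the accuracies at the 80/20 and 10/90 ratios as reported in the table and computes, for each model, how many percentage points are lost when the training share shrinks; the variable names are illustrative.

```python
# Accuracy (%) from Table 1 at two training/test ratios.
ACC_80_20 = {"CNN": 87.96, "ResNet-50": 96.43,
             "GoogleNet": 96.02, "Our Model": 94.65}
ACC_10_90 = {"CNN": 75.88, "ResNet-50": 75.06,
             "GoogleNet": 77.37, "Our Model": 86.23}

# Percentage points lost when moving from 80/20 to 10/90.
drop = {m: round(ACC_80_20[m] - ACC_10_90[m], 2) for m in ACC_80_20}

# Model that degrades least under data scarcity.
most_stable = min(drop, key=drop.get)
```

The drops are 12.08 points for the two-layer CNN, 21.37 for ResNet-50, 18.65 for GoogleNet and 8.42 for the proposed model, so the largest networks are the ones most affected by the shortage of training data.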

To expand the training dataset, an algorithm was developed that uses the coordinates to which each
dataset element is georeferenced to load images of the territory at various scales from
MapBox via its API. Thus, the basic dataset (level $L_0$) was extended with additional levels of information
at no significant cost. The final extended dataset has the following structure:

- level $L_0$: labeled Sentinel-2 ERS images (64 × 64 pixels, visible
spectral range, natural colors) of regional metageosystems. The training and test samples are split 10/90 to
simulate a data scarcity situation.

- levels $L_1$, $L_2$, $L_3$: fragments of open satellite images in the visible spectral range, automatically
received from the online map provider MapBox via an application programming interface at tile zoom levels
8, 12 and 14, respectively.

Extension of the original dataset means that each classified area is represented by four images of
the territory at different scales. It is of interest to analyze the learning process of the proposed
model. Training of neural networks is a stochastic process; therefore, a series of 10 experiments was carried
out. In the early stages of training, the designed model shows low classification accuracy on the extended set,
with accuracy growing almost from zero, whereas the lightweight two-layer convolutional neural network and the
ResNet50 model achieve an accuracy of more than 40% from the first epoch. However, after the 10th training
epoch, the proposed solution outperforms the other models, reaching an expected accuracy of 86%. We also note the
small standard deviation from the mean that the proposed model exhibits when trained on a small
dataset. This indicates the stability of the learning process of the model and its ability to correctly generalize
information about the analyzed features. Thus, expanding the initial dataset from the standpoint of the
geosystem approach and developing a model capable of analyzing this set made it possible to improve
the classification accuracy when training data are scarce (splitting the data into training and
validation sets in ratios from 10/90 to 40/60) and to exceed the accuracy of deep machine learning
models when classifying the test dataset.


4.2. Assessment of the water balance of the territory 

The spatio-temporal structure of the geosystems of Mordovia is determined by its geographical position
in the system of subboreal semiarid geosystems of the stratal-tiered Volga (Privolzhskaya) Upland and the stratal
Oka-Don Lowland, which is expressed in the functioning of forest-steppe landscapes and of the forest, meadow, bog
and other landscapes genetically and territorially associated with them.

System (subsystem) of landscapes. The radiation balance for the year is 1638.29 MJ/m²; in December it is
equal to minus 16.76 MJ/m², and in June, 339.39 MJ/m². The annual sum of direct solar radiation reaching a
surface perpendicular to the sun's rays is 3536.36 MJ/m²; in June it is equal to 536.32 MJ/m², and in
December, 8.38 MJ/m².

The average annual air temperature is 3.5-4 °C. Over the year, it varies from 19-19.8 °C in July
to between minus 12.2 and minus 11.7 °C in January. The average minimum temperature of the coldest month in the
territory under consideration is minus 16.4 °C; the average maximum temperature of the warmest month is 26 °C.
Relative air humidity varies from 84-86% in December-January to 62-64% in May-June and averages
76-77% for the year.

On average, 598-636 mm of precipitation falls per year: 221-252 mm during the cold period from
November to March, and 337-391 mm during the warm period from April to October. On average, there are 70
days with liquid precipitation (342 mm) per year. In the long-term regime, the functioning of the geosystems is
expressed in the quantitative water balance characteristics given in Table 2.
Table 2. Average annual long-term water balance of Mordovia

Component                        mm      km³
Rainfall (P)                     620     16.6
Full river runoff (R)            111     2.90
Surface runoff (S)                81     2.12
Underground component (U)         30     0.78
Evaporation (E)                  509     13.3
Gross humidification (W)         539     14.1

Runoff coefficient R/P: 0.18
Underground runoff share (U/R): 27%
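
The figures in Table 2 can be cross-checked against each other. The sketch below assumes the usual relations between the components (river runoff is the sum of surface and underground runoff, precipitation is the sum of river runoff and evaporation, and gross humidification is precipitation minus surface runoff); these identities are not stated in the text, but the tabulated values in millimetres satisfy them. The surface runoff value is read as 81 mm.

```python
# Water balance components from Table 2, in mm per year.
P = 620  # rainfall
R = 111  # full river runoff
S = 81   # surface runoff
U = 30   # underground component
E = 509  # evaporation
W = 539  # gross humidification

# Assumed balance relations (standard form, not quoted from the text).
river_runoff_ok = (R == S + U)
precipitation_ok = (P == R + E)
humidification_ok = (W == P - S)

# Derived indicators reported in the last two columns of the table.
runoff_coefficient = round(R / P, 2)        # R/P
underground_share = round(100 * U / R)      # U/R, %
```

All three relations hold exactly for the tabulated values, and the derived indicators reproduce the reported runoff coefficient of 0.18 and the underground runoff share of 27%.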


The main source of water resources is atmospheric precipitation, which not only forms river and surface
runoff, gross soil moisture and evaporation, but also replenishes groundwater reserves.

Class (subclass) of geosystems. The border of the Volga Upland and the Oka-Don Lowland coincides mainly with
the Mesozoic-Paleogene Ulyanovsk-Saratov syneclise. In the Neogene-Quaternary period, arched uplifts formed in
place of the Paleozoic and even Mesozoic-Paleogene troughs, creating the Volga Upland. The Oka-Don Lowland is
formed by morphostructures of a transitional type: the latest uplifts or subsidences covered large areas of the
earth's crust with a heterogeneous tectonic structure.

The residual-watershed massifs of the erosion-denudation plain are crowned with relics of the Oligocene
planation surface with average heights of 280-320 m. The depth of the erosional incision reaches 100-120 m.
The minimum absolute heights are noted in the Sura valley (89 m). The Lower Paleocene sediments, represented
by fractured opokas (siliceous rocks), sandstones, marls, sands, and clays, overlie the Upper Cretaceous carbonate
and terrigenous aquifers everywhere, often forming a single aquifer with them. The maximum thickness in the most
complete sections reaches 60-90 m. The thickness of the water-saturated rocks is 20-25 m in the central parts of
the interfluves and decreases to complete wedging out at the base of the slopes.

The erosion-denudation plain passes into the secondary moraine plain of the marginal part of the Volga
Upland with a scarp up to 80 meters high. This geomorphological province is distinguished by gentler slopes and
less dense ravine dissection. Quaternary sediments and bedrock form aquifers that often do not coincide in their
area of distribution and that, where they occur in succession, are as a rule not separated by a confining
(impermeable) layer.

The general structure of lithohydrogenic systems is determined by the following hydrogeological
conditions:

a. Groundwater of the Quaternary sediments is characterized by fragmented distribution and by uneven filtration
properties, thickness and hydrodynamic properties of its aquifers.

b. Albian aquifer, up to 27 m thick; the sand filtration coefficient varies from 0.7 to 10 m/day.

c. Oxfordian-Kimmeridgian aquitard of variable thickness, from 20 to 240 m.

d. Bathonian-Callovian aquifer, whose thickness decreases eastward from 40-45 m, as the water-bearing sands are
replaced by clays, until complete wedging out near the Insar river valley; the sand filtration coefficient varies
within 0.03-2.5 m/day, averaging 1.1-1.2 m/day.

e. The Bajocian and, in some places, the overlying Lower Bathonian aquitard, composed of dense fat clays of
relatively constant thickness, from 5-8 to 12-15 m.

f. The water-glacial plains of the Oka-Don lowland have absolute elevations up to 180 m with a general slope 
towards the valleys of medium and small rivers. They are characterized by wide watersheds—up to 8-10 km, 
gentle and slightly dissected slopes. The depth of the erosion incision does not exceed 30—40 m. 

g. The aquiferous Carboniferous-Permian carbonate complex is of particular importance in the functioning of 
the lithohydrogenic geosystems of ancient runoff troughs. The thickness of the flooded strata with fresh 
waters ranges from 20 to 250 m. The filtration coefficient varies within significant limits—from 3.3 to 80 
m/day. The model layer has the highest filtration properties in the area of structural uplifts. 

h. River valleys. Most of the river valleys in Mordovia are tectonic in nature, which is manifested in their
rectilinear pattern. Structural lines are often zones of groundwater discharge and of increased activity of many
geoecological processes associated with the geological environment: karst, landslides, suffosion and others.
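
The filtration coefficients and thicknesses listed above can be combined into a rough comparison of the aquifers. The sketch below uses the transmissivity relation T = K · b, which is a standard hydrogeological relation and is not given in the text. Pairing the largest reported K with the largest reported thickness is a simplifying assumption, so the result is an upper bound rather than a measured value. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aquifer:
    name: str
    k_max: float  # largest reported filtration coefficient, m/day
    b_max: float  # largest reported saturated thickness, m

    def max_transmissivity(self) -> float:
        """Upper-bound transmissivity T = K * b, in m^2/day (assumed relation)."""
        return self.k_max * self.b_max


# Upper values taken from items b, d and g of the list above.
aquifers = [
    Aquifer("Albian", 10.0, 27.0),
    Aquifer("Bathonian-Callovian", 2.5, 45.0),
    Aquifer("Carboniferous-Permian", 80.0, 250.0),
]

# Rank the aquifers by their upper-bound transmissivity.
ranked = sorted(aquifers, key=Aquifer.max_transmissivity, reverse=True)
for aquifer in ranked:
    print(f"{aquifer.name}: up to {aquifer.max_transmissivity():.1f} m^2/day")
```

Under these assumptions the Carboniferous-Permian carbonate complex ranks far above the other two aquifers, which agrees with the particular importance the text assigns to it.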

Groups (subgroups) of geosystems. The compiled database of water occurrences includes information on 3,315
sources; 2,808 springs; 865 hollows and bogs; and 1,370 landslide zones with discharge of interstratal and ground
waters.

The types (subtypes) of geosystems, distinguished by the structure of the soil cover and vegetation
features, largely determine the features of the economic development of the territory.

The peculiarities of the interaction of zonal and azonal factors determine the functioning of the following 
types of geosystems: 

a. Broad-leaved forests of erosion-denudation plains with gray forest rubble soils.

b. Broad-leaved forests of near-watershed areas of secondary moraine plains with gray loamy soils.

c. Meadow steppes with dominance in the structure of the soil cover of podzolic black soils, leached black 
soils, and meadow black soils. 

d. Mixed forests of water-glacial plains with gray, light gray forest and sod-podzolic soils of sandy loam and 
light loamy texture. 

In the meadow-steppe complexes, significant tracts of arable landscapes with a sparse network of large
agricultural settlements have formed, while the geosystems of broad-leaved forests are characterized by smaller 
tracts of tillable land, occupying relatively small areas of near-valley slopes with the most fertile soils. This area 
is characterized by a denser network of settlements, but they are small in terms of population. In the group of 
landscapes of mixed forests, a common feature of the cultural landscape is focal agricultural and forestry 
development with sparsely populated rural and forest settlements. 

Genus (subgenus) of geosystems. The spatial structure of the genera determines the allocation of frame
elements of ecological balance zones of different levels for the functioning of the economic framework. It also
serves as the main territorial carrier of information on the formation of the water regime when developing a
concept for regulating the regional water balance.

a. Upland massifs composed of strongly fractured siliceous-carbonate rocks of the Paleogene (opokas and marls
with lenses of diatomite, tripoli, and sands). Highland oak forests dominate, on gray forest soils that are rubbly
to varying degrees. Single-layer filtration of openly drained water flows, with leakage into the underlying
aquifer, is characteristic. The aquifer is fed by atmospheric infiltration; intensive discharge occurs through
the local gully and ravine network. The geosystems are characterized by a high infiltration potential of
120-160 mm per year, which is somewhat reduced on agricultural lands, where planar and linear erosion is active
in the rugged relief.

b. Near-watershed spaces composed of fractured carbonate rocks (chalk and marl) and Upper Cretaceous sands and
sandstones, overlain by thin deluvial loams. Landscapes of deciduous forests with gray forest soils and black
soils are selectively developed. Single-layer filtration of openly drained water flows is typical. A high
infiltration potential of 120-160 mm per year determines the formation of the groundwater recharge area. The
calculated filtration coefficient of bedrock varies from 0.4 to 7.4 m/day, averaging 0.6-0.8 m/day.
The presence of impervious horizons at the base determines the active discharge of groundwater.
The groundwater discharge zone is characterized by the spread of small relict peat bogs and the presence of
excessively humid geosystems. The geosystems are weakly resistant to technogenic loads; when the landscapes are
developed, planar and linear erosion is activated.

c. The near-watershed areas of the secondary moraine plains are dominated by selectively developed forest
landscapes with a relatively high infiltration potential of 20-40 mm per year. The characteristic features of the
functioning of lithohydrogenic geosystems are a noticeable number of discharge levels of interstratal groundwater
on the sides of gullies and on the bedrock slopes of small river valleys, the formation of hollow-type springs,
and the spread of landslide forms with a significant variation in their density and size.

d. Central parts of river basins, composed of deluvial and loess-like loams, underlain by terrigenous rocks of
the Lower Cretaceous and Jurassic. Since ancient times, the structure of geosystems has been dominated by
meadow-steppe geosystems with steppe oak forests. The infiltration potential of the geosystems is low, 10-20
mm per year. Groundwater and interstratal waters are fed by inflow from adjacent aquifers; discharge occurs
through lateral outflow.

e. Ancient troughs of glacial water runoff, composed of fluvioglacial and glacial sediments underlain by
terrigenous rocks, with coniferous and coniferous-deciduous forests and a high infiltration potential of
120-130 mm per year. The features of the lithogenic base determine the functioning of the recharge area for
groundwater and interstratal water. Groundwater seeps out in the bottoms of the ravines and in the subchannel
flows of river valleys.

f. Geosystems of water-glacial plains are composed of fluvioglacial sands underlain by carbonate rocks.
Geosystems of coniferous and mixed forests are characterized by a high infiltration potential of 120-130 mm
per year. In hydrodynamic terms, infiltration and influation processes prevail. The thickness of the
water-saturated strata with fresh waters ranges from 20 to 250 m. The filtration coefficient varies within
significant limits, from 3.3 to 80 m/day. This is the recharge area of the aquiferous Carboniferous-Permian
carbonate horizon.

g. In valley geosystems, the above-floodplain terraces, the bedrock sides of river valleys and the floodplains
are distinguished according to the peculiarities of their functioning. Infiltration potential varies widely:
from 120-130 mm per year under coniferous and mixed forests to 10-120 mm per year in forest-steppe and on
floodplains. The mapping results showed that the most active groundwater discharge along the bedrock sides of
river valleys occurs in the altitude range from 120 to 250 m, with a maximum concentration at absolute elevations
of 141-210 m. The water-glacial plains, especially in the Moksha-Alatyr interfluve, are distinguished
by the smallest spring runoff.

Thus, differences in landscape conditions across the territory determine a significant variation in the average
long-term infiltration potential (from 10 to 160 mm/year). The maximum values are typical for forest
geosystems of ancient sandy troughs of glacial water runoff and of residual-watershed spaces composed of
siliceous-carbonate rocks. The revealed features of natural differentiation make it possible to determine the
main zones of ecological balance.
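
The infiltration potentials reported for genera a to g can be collected into a small lookup table to confirm the overall range quoted above and to shortlist candidate recharge areas. This is only an illustrative sketch: the short labels are paraphrases, the two ranges given for valley geosystems are merged into one interval, and the 100 mm/year cut-off is a hypothetical threshold that does not appear in the text.

```python
# Infiltration potential (mm/year) per genus of geosystems, as (low, high),
# transcribed from items a-g. Labels are shortened for illustration.
infiltration = {
    "a. upland massifs, siliceous-carbonate rocks": (120, 160),
    "b. near-watershed spaces, carbonate rocks": (120, 160),
    "c. near-watershed areas, secondary moraine plains": (20, 40),
    "d. central parts of river basins": (10, 20),
    "e. ancient troughs of glacial water runoff": (120, 130),
    "f. water-glacial plains": (120, 130),
    "g. valley geosystems": (10, 130),  # merged from 120-130 and 10-120
}

# Overall range across all genera.
overall = (
    min(low for low, _ in infiltration.values()),
    max(high for _, high in infiltration.values()),
)

# Hypothetical cut-off for shortlisting recharge areas (an assumption).
HIGH_THRESHOLD = 100
recharge_candidates = [
    name for name, (low, _) in infiltration.items() if low >= HIGH_THRESHOLD
]

print("overall range, mm/year:", overall)
for name in recharge_candidates:
    print("candidate recharge area:", name)
```

With this threshold the shortlist contains genera a, b, e and f, that is, the forest geosystems on siliceous-carbonate and carbonate rocks and on sandy glacial deposits.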

Regional zones of ecological equilibrium are projected in landscapes of mixed forests of water-glacial
plains in the central part of the Vad basin, on the right bank of the Moksha and on the left bank of the Alatyr,
forming a node of ecological equilibrium zones in the Moksha-Alatyr interfluve. These zones are important because
the allocated geosystems actively replenish the groundwater resources mainly used for the centralized water supply
of Mordovia: the aquiferous Carboniferous-Permian carbonate complex, with predicted fresh water resources of
1584.9 thousand m³/day, and the aquifer of the mid-Quaternary-modern alluvial horizon, with 9.0 thousand m³/day.
The landscapes of deciduous forests of the erosion-denudation plains of the Sura-Alatyr interfluve should become
zones of ecological balance of republican significance. They are important for maintaining the water balance and
replenishing the resources of the Upper Cretaceous carbonate-terrigenous aquiferous complex, with reserves of
152.1 thousand m³/day. Frame elements of regional significance of forest geosystems of near-watershed spaces of
secondary moraine plains should be coordinated with zones of regional and republican significance. Zones of
ecological balance should form a single system of ecological "corridors" that will ensure the replenishment of
groundwater reserves and the regulation of surface runoff.


5. CONCLUSION 

Hierarchical structuring of geosystems optimizes the diagnostics of the leading agents in the interaction of
physical and geographical factors, of the patterns of spatio-temporal changes in their states, and of the
direction of metabolic processes and of the transformation of matter and energy. The article proposes a system of
geoinformation methods and algorithms for the complex interpretation of remote sensing data, which makes it
possible to form ensembles of classifiers based on the Ensemble Learning methodology in order to assess the
stability of geosystems and predict exogeodynamic processes. The proposed approach differs in a fundamentally new
model of metaclassifier organization and in the application of the geosystem approach to the preparation of data
for machine analysis.

Conjugated analysis of spatial data from several sources makes it possible to optimize the operational
diagnostics of the development of exogeodynamic processes. It is advisable to automate this process by
forming ensembles of classifiers, sequentially solving the following problems: forming a set of mono-classifiers,
determining a metaclassifier algorithm, training the mono- and metaclassifiers, and evaluating the effectiveness
of the ensemble and its individual monomodels. The use of ensembles makes it possible to quickly analyze
geosystems and thus track the development of natural and natural-man-made processes and phenomena.
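
The sequence of problems listed above can be illustrated with a deliberately small, dependency-free sketch. It is not the metaclassifier model proposed in the article: the data are synthetic, the mono-classifiers are single-feature threshold rules, and the metaclassifier is a lookup table over the patterns of mono-classifier votes. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_data(n, rng):
    """Synthetic two-class samples with three noisy features (illustrative only)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append(([rng.gauss(1.5 * y, 1.0) for _ in range(3)], y))
    return data


def train_stump(data, feature):
    """Mono-classifier: threshold one feature at the midpoint of the class means."""
    means = []
    for label in (0, 1):
        values = [x[feature] for x, y in data if y == label]
        means.append(sum(values) / len(values))
    threshold = (means[0] + means[1]) / 2
    upward = means[1] > means[0]  # which side of the threshold is class 1
    return lambda x: int((x[feature] > threshold) == upward)


def train_meta(data, monos):
    """Metaclassifier: majority label for each pattern of mono-classifier votes."""
    votes = defaultdict(Counter)
    for x, y in data:
        votes[tuple(m(x) for m in monos)][y] += 1
    table = {p: counts.most_common(1)[0][0] for p, counts in votes.items()}

    def predict(x):
        pattern = tuple(m(x) for m in monos)
        # For a pattern unseen in training, fall back to a plain majority vote.
        return table.get(pattern, int(2 * sum(pattern) > len(pattern)))

    return predict


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


rng = random.Random(0)
train, holdout, test = (make_data(300, rng) for _ in range(3))
monos = [train_stump(train, f) for f in range(3)]  # set of mono-classifiers
ensemble = train_meta(holdout, monos)              # metaclassifier on held-out votes
mono_scores = [accuracy(m, test) for m in monos]   # evaluate each monomodel
ensemble_score = accuracy(ensemble, test)          # evaluate the ensemble
print(mono_scores, ensemble_score)
```

Training the metaclassifier on a held-out subset, rather than on the data used for the mono-classifiers, is a common design choice that keeps the second level from simply memorizing first-level training errors.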

The article describes a new technique for constructing convolutional neural networks that are effective
in the analysis of large spatio-temporal data, and determines strategies for configuring the sets and dimensions
of convolution and subsampling layers and the dimension reduction algorithms. A solution is proposed to the
problem of classifying high-resolution remote sensing data by applying deep learning methods and algorithms when
labeled data are scarce. For the first time, it was proposed to solve this problem by means of the geosystem
approach, by analyzing the genetic homogeneity of spatially adjacent objects of different scales and hierarchical
levels. The advantage of the proposed model lies in its large number of degrees of freedom, which allows flexible
configuration of the model depending on the task at hand. Testing the proposed model on the classification of an
ERS image dataset, algorithmically augmented using the geosystem approach, demonstrated that it improves the
classification accuracy by 9% when labeled data are scarce and that, with a large amount of training data, it
reaches a classification accuracy only slightly (by 2%) inferior to that of other deep models.

The variety of factors involved in water balance formation produces many types of conditions for the
formation of water resources. To study them, it is proposed to use a systems approach as the most important
methodological tool for understanding the structure of the geographic shell. This approach rests on the
following propositions: i) natural and natural-man-made geosystems are complex open systems consisting of
elements interconnected by direct and feedback links; impacts on individual elements or structural
connections (matter and energy flows) cause chain reactions leading to a change in the states of geosystems,
ii) the integrity of geosystems is based on the exchange of matter and energy within them; as holistic
formations, they respond as a whole to external influences, and iii) the incessant exchange of matter and energy
is accompanied by changes in the space-time structure of geosystems; the rates of these changes differ, which is
reflected in the metachronism of changes in their states.